Supervised classification of news articles by whether they mention diseases and outbreaks
Abstract
Global Viral Forecasting (GVF) is working to use data extracted from human-readable web documents to predict when and where disease outbreaks will occur. As part of this effort, documents must be classified by whether they are relevant to this prediction process. To that end, GVF has collected 75,176 manual annotations of web documents. These annotations include which of the following mutually exclusive classes the document falls into:
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Beyond its own merits, text classification (TC) has become a cornerstone of many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Lost in Space: Geolocation in Event Data
Extracting the “correct” location information from text data, i.e., determining the place of event, has long been a goal for automated text processing. To approximate human-like coding schema, we introduce a supervised machine learning algorithm that classifies each location word to be either correct or incorrect. We use news articles collected from around the world (Integrated Crisis Early War...
Finding Bias in Political News and Blog Websites
News and blog websites often have political bias (such as Republican, Democratic) in their articles. Automatic detection of the bias will improve personalized feed and categorization of news and blog articles. Our project aims to predict Republican vs. Democratic bias of news websites and political blogs using the phrases (a.k.a. memes) they quote in their text. We form a bipartite graph of web...
Corporate News Classification and Valence Prediction: A Supervised Approach
News articles have always been a prominent force in the formation of a company’s financial image in the minds of the general public, especially the investors. Given the large amount of news being generated these days through various websites, it is possible to mine the general sentiment of a particular company being portrayed by media agencies over a period of time, which can be utilized to gau...
Temporal Topic Modeling to Assess Associations between News Trends and Infectious Disease Outbreaks
In retrospective assessments, internet news reports have been shown to capture early reports of unknown infectious disease transmission prior to official laboratory confirmation. In general, media interest and reporting peak and wane during the course of an outbreak. In this study, we quantify the extent to which media interest during infectious disease outbreaks is indicative of trends of re...